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1.  INTRODUCTION 


The  object  of  this  paper  ii  to  aurvey  rer.  cnt  results  in  the  statisti¬ 
cal  analysis  of  univariate  point  processes  (series  of  events).  The  survey 
is  personal  reflecting  my  own  present  interests,  and  is  not  comprehensive. 
For  convenience,  I  have  taken  "recent"  to  mean  anything  published  since 
Cox  and  Lewis  (1966).  I  have  also  mentioned  Pome  topics  and  results 
which  wei  e  omitted,  for  reasons  of  emphasis  or  ignorance,  from  that 
monograp'.!.  Finally,  I  point  out  areas  where  further  work  is  required. 

A  survey  of  work  in  this  field  is  not  so  difficult  as  a  survey  of, 
say,  the  theory  of  point  processes  because  the  advances  in  the  statisti¬ 
cal  analysis  of  point  processes  are,  by  comparison  with  the  theory,  few. 
This  may  reflect  the  relative  difficulties  of  these  areas,  but  that  thought 
may  be  a  personal  bias. 

Some  of  the  shortcomings  of  the  Cox  and  Lewis  monograph  were, 
in  hindsight,  the  following: 

(i)  Not  enough  consideration  of  grouped  data.  In  some  cases  where 
point  events  occur,  such  as  epidemiology,  recording  limitations  or  the 
volume  of  data  force  one  to  work  with  numbers  of  events  in  fixed  inter¬ 
vals  whose  width  may  or  may  not  be  controllable  in  advance.  I  have 


touched  on  this  problem  recently  (Lewis,  1970)j  separate  epectral 


•  nalyaei  of  the  Intervale  and  of  the  counting  proceee  (Bartlett,  1963) 
are,  for  example,  not  poealble.  One  hae  to  use  a  epectral  analyeie  of 
the  grouped  counte  (1.  e.  ,  number  of  evente  in  a  fixed  time  Interval), 
which  ia  very  nonatandard  in  ita  diatribution  theory,  but  in  which  well 
known  problema  of  aliaaing  ariae.  Reaulta  cf  Cox  (  1970)  may  be  uaeful 
here,  and  Cox  (195S)  has  considered  some  other  aspects  of  the  analysis 
of  grouped  events. 

Interesting  examples  of  this  type  of  problem  are  found  in  physics 
and  optics.  For  example  (Helstrom,  1964;  Karp  and  Clark,  1970) 
photon  or  other  particle  emissions  are  known  from  physical  consider** 
atbns  to  be  generated  by  a  doubly  stochastic  Poisson  process  and  it  is 
required  to  determine  parameter  values  of  the  driving  process  from 
counts  of  the  number  of  photons  emitted  in  successive  periods.  Here 
it  is  prohibitively  costly  to  record  exact  times  of  occurrence.  However, 
the  recording  interval  can  be  determined  in  advance  by  the  experimenter. 
Problems  of  grouped  Poisson  counts  (McNeill,  1971)  are  also  common. 

(ii)  Very  little  emphasis  was  given  to  sequential  methods.  While  this 
was  to  deliberate  to  save  space,  it  is  also  true  that  most  data  I  have  come 
across  is  presented  for  analysis  in  toto.  This  may  change  as  better  re¬ 
cording  methods  are  introduced  and,  hopefully,  as  statisticians  are  called 
in  before  the  fact.  There  is  still  a  problem  in  that  simple  sequential 
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methods  are  known  only  for  Poisson  processes  (homogeneous  or  non- 
homogeneous)  and  also  that  rather  unsmooth  inhomogeneities  occur  in 
practice  which  make  the  application  of  formal  sequential  methods  based 
on  very  definite  assumptions  quite  hazardous. 

Less  formal  sequential  methods  are  useful;  in  particular,  an 
analysis  of  the  data  in  successive  sections  is  very  useful,  both  to  cut 
down  on  computation  time  in,  say,  spectral  analyses  (Lewis,  1970) 
and  to  examine  the  time  evolution  of  the  process. 

As  an  example,  consider  a  series  of  arrivals  at  an  intensive 
care  unit  in  a  hospital.  This  data  will  be  used  for  illustration  through¬ 
out  the  paper;  it  was  supplied  by  Dr.  A,  Barr,  of  the  Oxford  Region- 
al  Hospital  Board,  England.  The  first  r-ction,  consisting  of 
nx  251  arrivals  in  t^  *  409  days  (4  February,  1963  -  18  March,  1964), 
was  analyzed  in  Chapter  1  of  Cox  and  Lewis  (196  6).  Later  on,  the 
arrivals  up  to  6  February,  1968  were  received.  Three  subsequent 
sections  of  length  t^  *  409  days  were  taken  from  these  later  arrivals 
for  comparison  and  their  statistics,  as  well  as  those  for  the  total 
record,  are  shown  in  Table  1. 

♦  These  times-of-arrival  were  exceptionally  well  recorded.  Of  the 
1468  arrivals  in  the  1420-day  period  from  4  February,  1963  to 
6  February,  1968,  only  one  time-of-day  was  not  recorded.  Nine  tied 
arrivals  occurred.  Generally,  recording  times  seemed  to  be  at  the 
five-minute  intervals  of  the  hour,  although  other  times  do  occur.  The 
data  to  18  March,  1964  is  given  in  Cox  and  Lewis  (1966,  p.  255);  the 
rest  In  Lewis  (1971). 


Table  1,  Arrivala  at  an  intensive  care  unit  (no  ties) 


Period  1 

Period  2 

Period  3 

Period  4 

Total 

4  Feb  'sa¬ 

19  Mar  '64- 

3  May  '65- 

16  June  '66- 

4  Feb  '63- 

ls  Mar  >64 

2  May  '65 

15  June  '66 

29  July  '67 

6  Feb  '68 

n  251 

350 

372 

316 

1458 

‘o  409 

409 

409 

409 

1829 

iXdays)  218.47 

218.  53 

183.  69 

197.  20 

954.  25 

U  1.  874 

2.  222 

-3.  399 

-1.  099 

2.  874 

m«N/to  0.614 

0.  856 

0.  910 

0.  773 

0.  797 

n 

The  ■tatietlc  U  =  S  t. /n,  where  the  t,  are  the  timei  to  events  and 

i-1  i  i 

n  the  number  of  events  in  the  period  of  observation  of  length  is  the 

centroid  of  the  t^'s  and  is  used  to  test  (Cox  and  Lewis,  1966,  p.  47) 
for  the  trend  parameter  p  in  a  nonhomogeneous  Poisson  process  with 
rate 


X  (t)  ■  txpia  +  pt)»  X  exp(pt).  (1.  1) 

In  testing  for  p  ■  0  against  p  ft  0,  a  is  a  nuisance  parameter  with 
sufficient  statistic  n  and  the  test  is  based  on  the  distribution  of  U, 
given  n.  The  normalized  statistic  U  in  line  three  of  Table  1  should 
be  distributed  approximately  as  a  N(0,  1)  variable.  There  was  fairly 
strong  evidence  of  a  monotone  trend  at  the  end  of  the  first  observation 
period  (  P*  7  percent  level),  but  later  os  in  the  series,  as  in  Section  3, 
there  is  a  definite  decreasing  trend.  However,  the  total  arrival  process 


gives  fairly  strong  evidence  of  an  increase  in  X  (t),  the  value  U  ■  2.  874 


being  ■ignificant  at  about  a  2  percent  level.  For  Period  1  and  Period  2  combined, 

U  was  found  to  be  4.492  which  la  significant  at  a  level  much  smaller  than  1  percent. 
An  actual  sequential  test  of  p  >  0  against  p  ft  0  rejects  the  null  hypo¬ 
thesis  at  a  1  percent  level  after  550  days.  The  sectioned  analysis  of 
four  periods  suggests  a  long  cycle  or  quadratic  term  in  the  exponential 
trend  (1.  1).  The  nonhomogeneity  is  confirmed  very  strongly  by  a  dis¬ 
persion  test  (Cox  and  Lewis,  1966,  p.  232)  applied  to  the  numbers  of 
events  in  the  four  sections.  This  has  value  26.74;  its  distribution  is 
that  of  a  chi-square  variable  with  3  degree*  of  freedom  which  has  a 
0.  9999  point  equal  to  21. 11. 

A  plot  of  the  times -to-events  t^  against  serial  nxunber  i  is 
given  in  Figure  1,  and  a  plot  of  the  average  values  of  successive  sets 
of  twenty  intervals  between  arrivals  is  shown  in  Figure  2.  Again  the 
long  cycle  or  quadratic  trend  is  quite  evident  graphically.  We  return 
to  this  example  later  in  the  paper. 

(iii)  Sectioning  brings  up  in  some  ways  the  analysis  of  replicated 
point  processes,  which  again  was  not  considered  in  detail  by  Cox 
and  Lewis  (1966).  This  replication  can  occur  quite  commonly  in  exper¬ 
imental  situations,  for  instance  in  neurophysiology  where  experiments 
can  be  repeated  many  times.  Here  the  signals  are  trains  of  very  narrow 
spikes  of  apparently  fixed  height,  so  it  is  appropriate  to  analyze  the 
times  of  occurrence  of  the  spikes  as  a  point  process.  Observation  over 


ARRIVAL  NUMBER 
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VERAGE  OF  SUCCESSIVE  GROUPS 
INTER-ARRIVAL  TIMES 


8 


too  long  ■*.  period  can  be  deceiving,  because  of  physiological  deterioration, 
and  replication  is  sometinnes  preferable.  Of  course,  care  must  still  be 
taken  that  physiological  or  experimental  conditions  have  not  changed. 

Comparison  of  rates  and  trends,  mainly  in  Poisson  processes, 
was  discussed  in  Cox  and  Lewis  (1966).  See  also  Qureishi  (1964)  for 
the  comparison  of  rates  in  twoWeibxill  processes.  General  problems 
of  multiple  point  processes  (multivariate  point  processes)  are  discussed 
in  Cox  and  Lewis  (1972)  and  Perkel,  Gerstein  and  Moore  (1967b),  but 
are  beyond  fhe  scope  of  this  paper.  Lewis  (197  0)  discussed  estimates 
of  the  spectrum  of  counts  from  sectioned  data,  as  well  as  pooling  and 
comparison  of  the  spect^^a. 

(iv)  Although  trend  analysis  was  discussed  in  Chapter  3  of  Cox 
and  Lewis  (1966),  I  don't  believe  it  received  enough  emphasis. 

Also,  problems  such  as  the  analysis  of  cyclic  trends  were  barely 
touched  upon.  These  and  other  problems  in  trend  analysis  are 
discussed  in  more  c’atail  later  in  the  paper. 


(v)  Finally,  perhaps  more  emphasis  could  have  been  given  to  graph¬ 
ical  methods.  These  were  discussed  in  Chapter  1,  but  no  reference  was 
made  to  probability  plotting,  for  example,  in  later  chapters.  See,  for 
example,  Wilk  and  Gnanadesikan  (196  8). 
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The  field  where  moet  use  has  been  made  of  techniques  for  the 
analysis  of  point  processes  is  in  neurophysiological  work  on  the  signals 

occurring  on  nerve  fibers.  A  summary  of  techniques  for  this  type  of 

I 

analysit  given  in  Perkel,  Gerstein  and  Moore  (1967a,  1967b)  largely 
parallels  Cox  and  Lewis  (1966),  with  more  information  on  the  special 
problems  of  neurophysiology.  For  later  work  and  applications,  see, 
for  example,  Moore,  Segundo,  Perkel  and  Levitan  (197  0)  and  the 
references  given  in  that  paper.  It  would  be  impossible  to  try  to  sum- 
mari'^e  all  of  this  work  here;  some  of  it  will  be  touched  on  later,  but 
there  is  virtually  no  statistical  methodology  given  in  them  which  is  not 
given  in  Cox  and  Lewis  (1966). 

I 

Another  interest'  ^  field  of  application  is  to  the  study  of  the 
occurrence  of  earthquakes.  While  strictly  not  a  univariate  point  pro¬ 
cess,  an  approximation  to  the  earthquake  process  as  a  univariate  point 
process  yields  useful  insights.  For  such  analyses,  see  Vere-Jones, 
Turnovsky  and  Eiiry  (1964),  Vere-Jones  and  Davies  (1966),  Vere-Jones 
(1970),  and  Shlien  and  Toksoz  (1  970).  The  discussion  of  Vsrs-Jonss  (1970) 
contains  extensive  comments  on  the  problem  of  analyzing  earthquake  oc¬ 
currence  data. 

Computational  problems  in  the  spectral  analysis  of  point  processes 
have  been  discussed  by  Lewis  (1970).  By  spectral  analysis  here  and  in 
the  rest  of  the  paper  we  mean  Bartlett's  spectral  analysis  of  the  counting 
proces8^N(t)  of  a  point  proces s  (Bartlett,  1963;  Cox  and  Lewis,  1966, 


\ 
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1  . 

Chapter  5).  T'le  apectral  analyiii  of  the  intervals  between  events  in  a 
point  process  has  been  greatly  facilitated  by  the  availability  of  the  fast 
Fou’-ier  transform  algorithm;  see  Cooley  and  Tukey  (1965)  and  Cooley, 
Lewis  and  Welch  (1970)  for  details. 

Most  of  the  statistical  methodology  for  the  analysis  of  univariate 
series  of  events  given  in  Cox  and  Lewis  (1966)  has  been  implemented  in 
a  computer  program  called  SASE  IV.  For  details,  see  Lewis  (1966) 
and  Lewis,  Katcher  and  Weiss  (1969).  SASE  IV  is  a  large  FORTRAN 
program  with  graphical  output  which  is  available;  from  the  IBM  Program 
Information  Department,  40  Saw  Mill  River  Road,  Hawthorne,  New  York 
10532,  as  Program  No.  360  L  130001. 

In  subsequent  sections  of  this  paper  we  discuss  first  two  central 
problems  in  the  statistical  analysis  of  point  processes,  namely  tests 
for  renewal  processes  and  tests  for  Poisson  processes.  Then  techniques 
for  specific  non- renewal  processes,  such  as  doubly  stochastic  and  cluster¬ 
ing  Poisson  processes  are  discussed.  The  inability  to  write  down  a 
likelihood  function  makes  the  formal  analysis  of  non- renewal  point  pro¬ 
cesses  difficult;  it  is  only  for  the  important  doubly  stochastic  Poisson 
process  that  techniques  are  beginning  to  appear. 

The  final  ssetions  deal  with  trend  analyses,  mostly  of  non- 
hompgeneous  Poisson  processes;  first  we  consider  the  case  of  monotone 
trends,  then  the  case  of  cyclic  trends  and  their  relationship  to  the 
spectral  analysis  of  point  processes.  Following  this,  some  general 


problems  of  trend  analysis  are  considered;  these  include  tests  for  particular 


rate  functions  and  the  nonhomo  gene  ous  Poisson  process  model  per  se,  and 


the  definition  of  "residuals"  in  the  analysis  of  nonhomogeneous  Poisson 


models 


2. 


TESTS  FOR  RENEWAL  PROCESSES 


Ic 

2. 1  Mt^rkov  interval  proceiiea  and  aerial  correlation  coefficients. 

A  natural  extension  of  the  renewal  process  model  is  to  processes 
with  first  order  dependence  of  intervals  (Wold,  1948;  Cox,  1955).  The 
naturalness  may  be  only  mathematical  since  point  processes 
of  this  type  have  not  been  commonly  found  in  applications,  although 
they  have  recently  been  postulated  in  neurophysiological  contexts 
(Lampard,  1968;  Walle,  et  al,  1969).  In  the  process  of  Walloe 
et  al  (1969),  the  first  order  Markov  interval  property  depends 
on  the  input  to  a  neuron  being  a  Poisson  process.  However,  since  the 
input  is  the  superposition  of  an  unknown  number  of  fairly  regular  neuron 
signals,  the  hypothesis  is  tenuous.  A  drawback  in  the  use  of  the  first 
order  Markov  interval  process  as  an  approximate  model  for  serial 
dependence  in  series  of  events  is  the  difficulty  of  obtaining  analytical 
results  on,  for  example,  the  spectrum  of  counts  or  the  variancftime  curve. 
This  difficulty  is  closely  related  to  the  fact  that  there  are  no  regenera¬ 
tion  points  in  the  process. 

Another  problem  with  the  model  is  the  dearth  (until  recently) 
of  useful  bivariate  distribution  models.  Cox  (1955)  used  a  bivariate 
exponential  of  the  form 

f  (x;  X,  =  y)  =  \  (y)  exp{-\  (y)x>,  (2. 1) 

i+1 
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where 


\  (y)  .  X  q(1  +  x)  >  0. 


which  had  the  drawback  of  nonlinear  restrictions  on  the  parameter. 
Another  widely  used  model  for  bivariate  gamma  distributions  which 
arises  quite  naturally  has  been  discussed  by  Moran  (1967a),  Vere-Jones 
(1967),  Lampard  (1968),  and  Caver  (1970).  (See  also  Griffiths,  1969, 
for  general  properties  of  bivariate  gamma  distributions. )  These  bi¬ 
variate  distributions  all  have  positive  correlations.  Bivariate  exponen¬ 
tials  with  negative  correlation  have  been  derived  by  Caver  (1972)  and 
bivariate,  negatively  correlated  intervals  have  been  observed  in  a 
neurophysiological  context  by  Walloe  et  al  (1969). 

For  statistical  analysis,  one  of  the  most  important  theoretical 
results  on  bivariate  exponential  distributions  is  that  of  Moran  (1967a), 
who  showei  from  more  general  resuLt  that  the  serial  correlation 
':oefficient  (or  order  1)  for  bivariate  exponential  distributions  is 
always  greater  than  -0.6649; 


E{(X^  -  E(X))(X^^j  -  E(X)} 


var  (X^) 


1  -  r  *  -0.6649.  (2.  2) 

o 


Estimates  of  the  serial  correlation  and  tests  of  the  renewal 
hypothesis  against  a  first  order  Markov  interval  process  are  based  on 
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the  iample  aerial  correlation  coefficient  p^,  defined  as  follows. 

For  simplicity,  assume  observation  starts  with  an  event  and 

ends  with  an  event,  there  being  n  observed  intervals  between  events 

x,,x,  ...,x  with  mean  x«n“^Z:x,.  Let  z.  =  x.  -  x ,  Then 
1  £  n  ill 


n-1 

£  z,  z,  , 
,  ,  i  i+1 
n  isl 


^1  “  n-1  n 


(2.3) 


r  z. 
i*l  ^ 


Before  discussing  theoretical  results  on  tests  for  renewal  pro- 

'*>W 

cesses  based  on  p^,  consider  its  distribution  for  renewal  processes, 
a  great  deal  about  which  is  now  known. 

The  expected  value  of  under  the  renewal  hypothesis,  for  any 
distribution  of  the  Intervals  x^,  is  (Moran,  1967b) 


E(Pl)  - 


1 

n-1  ’ 


(2.4) 


1 

and  while  its  variance  ia  known  to  be  asymptotically  n  if  p  k  0,  the 
fc.-:act  variance  depends  on  the  distributions  of  the  x^  (Moran,  1967b): 


var(pj) 


2 

n  -  n  4- 1 
n^  (n-1) 


Ez 

n4l  ,  i  .  ^  n-2 

n(n-l)  \  2  2  ^  n(n-l)  ’ 


(2.5) 


Moran  (1967b)  obtained  an  approximation  by  replacing  the  expectation 
by  the  ratio  of  the  expectations  of  the  numerator  and  denominator.  This 
result  is  exact  for  normally  distributed  x^'s,  giving 
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var(p) 


n^(n-l) 


(2.6) 


For  exponentially  diitributed  x  ,  the  Moran  approximation  givei 


var{pj)  ~  - 


1_  +  ^  12® 

2  3"  4 

n  n  n 


(2.7) 


Moran  (197  0)  compared  hit  approximation  to  aampling  retultt  obtained 
by  Cox  (1967)  and  a  brief  eummary  of  their  retulte  on  the  variances 
follows. 


(a)  The  variance  of  p^  tends  to  be  smaller  than  that  for  the  normal 
case  for  positive  random  variables  with  long  tails. 

(b)  Moran's  approximation  works  reasonably  well  with  gamma  dis¬ 
tributions  with  shape  parameter  k  ^  0,  n  ^  50,  or  k  ^  8,  n  ^  10. 
Outside  these  limits  the  approximation  underestimates  the  variance 
(k  =  0  is  the  exponential  distribution). 


(c)  For  the  exponential  distribution,  a  partial  reconstruction  of 
Table  1  from  Moran  (197  0)  using  sampling  results  from  Goodman 
aud  Lewis  (1972)  follows. 
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Table  2.  Variance  of  in  exponentially  diitributed  random  samplee. 


Observed 

Sample  variance  Moran's  Difference/ 

size  n  (simulation)  approximation  Difference  Observed 


0.  0419 
0.  0182 
0.  00945 
0.  00219 


0.  0371 
0.  0176 
0.  00935 
0.  00221 


0.  0048 
0.  0010 
0.  00010 
0.  00002  0 


.  115 
.  055 
.  Oil 
.  009 


(d)  For  very  skewed  distributions,  a  better  approximation  to  var(pj^) 
is  needed.  This  is  apparent  from  Table  3  for  random  samples  where 
the  have  a  Weibull  distribution  with  shape  parameter  ^  ^l*baU) 

A  random  variable  with  this  Weibull  distribution  is  the  square  of  a  random 
variable  with  a  unit  exponential  distribution. 

Table  3.  Variance  of  in  1/2  Weibull  random  samples 


Sample 
size  n 

Observed 
variance 
(k  mulation) 

Moran's 

approximation 

Difference 

Differenc 

Observed 

50 

0.  01410 

0.  00058 

0.  01352 

.96 

100 

0.  007  82 

0.  00054 

0.  0072  8 

.  93 

450 

0,  00201 

0.  00047 

0.  00154 

.  77 

Returning  to  the  distribution  of  p^,  Moran  (197  0b)  showed  that 

l/2~ 

n  pj^  converges  in  distribution  to  a  unit  normal  variate  if  the  first 
four  moments  of  the  x^  exist.  Cox  (1966)  examined  the  distribution 
by  synthetic  sampling  for  the  case  where  the  x.  had  a  normal  distri¬ 
bution,  and  also  distributions  of  the  form 
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1 


-X 


I  i 


m  f 


r{k+i) 


(X  =  0). 


(2.  8) 


with  k=  24,  8,  1,  0,  -  1/2,  a  rectangular  distribution,  a  double 
exponential  distribution  and  a  Cauchy  distribution.  The  first  two 
moments  and  coefficients  of  skewness  and  kurtosis  were  examined. 

For  the  positive  random  variables,  the  distributions  were  generally 
positively  skewed  with  convergence  to  normality  apparently  fairly 
rapid. 

These  results  have  been  extended  for  the  case  of  x^  exponen¬ 
tially  distributed  in  large  scale  simulations  reported  in  Goodman  and 
Lewis  (1972). 

Figure  3  shows  a  c  imputer  plot  of  the  estimated  mean  (♦), 
standard  deviation  (x)  and  coefficients  of  skewness  (+)  and  kurtosis 
(X)  of  pj  as  a  function  of  n  (RHO(l,  N))  for  n  ■  11(1)  120,  130,  140, 
150.  The  simulations  involved  at  least  3,  000,  000  replications  for 
each  n,  so  that  the  sampling  variances  of  the  estimates  are  small. 

The  curves  have  not  been  smoothed.  Note  that  the  skewness  is  positive, 
with  a  maximum  value  of  about  0.  32  at  n  x  35,  after  which  it  starts 
back  toward  its  asymptotic  value  of  zero.  The  l.urtosis  is  small, 
going  from  positive  to  negative  at  about  n  *  35  and  then  going  back 
toward  its  asymptotic  value  of  zero  very  slowly. 

The  departure  from  normality  (positive  skewness)  is  seen 
much  more  clearly  in  Figure  4,  where  the  computer  plot  gives  esti- 
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mated  normalized  values  of,  from  top  to  bottom,  the  quantiles  at 
levels  0.001,  0.002,  0.005,  C.  010,  0.020,  0.025,  0.050,  0.100, 
the  estimated  mean  plus  (n-1)  \  and  the  quantiles  at 
levels  0.900,  0.950,  0.97  5,  0.980,  0.990,  0.995,  0.998,  0.999 
These  should  converge  to  the  corresponding  quantiles 
of  the  unit  normal  distribution. 

Note  two  things:  the  departures  from  normality  are  relatively 
small  for  the  inner  quantiles  and  convergence  to  the  normal  quantiles 
is  very  slow. 

More  detailed  results  from  which  the  departure  from  normality 
and  the  slow  rate  of  convergence  can  be  assessed  are  shown  in  Table  4. 
The  quantiles  and  moments  of  (called  RHO(l,  N)  in  the  computer) 
for  n  *  450  are  shown  in  the  rows  m  arked  exponential.  Of  these 
rows,  the  first  row  is  the  actual  estimated  value  of  the  moment  or 
quantile,  the  bracketed  quantities  in  the  next  row  are  the  estimated 
sampling  standard  deviations  of  the  estimates  (5  degrees  of  freedom), 
and  the  third  row  is  the  quantile  minus  pL,  all  divided  by  o .  This 
row  illustrates  the  departure  from  normality,  x^  ^  being  2.  039 
instead  of  1.  96  0. 

/N,/ 

Much  more  serious  departures  occur  for  the  quantiles  of 
from  random  samples  with  1/2  Weibull  intervals.  These  are  given 
in  the  rows  marked  "1/2  Weibull"  in  Table  4.  Thus,  not  only  is 
Moran's  approximation  poor,  as  seen  above,  but  the  normal  approxi- 
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mation  is  considerably  off.  The  simulation  results  for  the  1/2  Weibull 
case  are  discussed  more  fully  in  Goodman  and  Lewis  (1972). 

Moran  (1967b)  showed  that  tests  for  independence  against  first  order 
serially  correlated  intervals  with  exponentially  distributed  marginals 
based  on  are  asymptotically  most  powerful.  He  also  conjectured 
that  this  is  true  for  Gamma  distributed  intervals.  More  formal  results 
have  been  reported  fy  Yang  (197  0). 

Moran  (1967b)  also  proposed  two  modifications  of  the  serial  corre¬ 
lation  coefficient.  One  is  obtained  (see  also  Cox,  1955)  by  replacing  the 
estimate  of  the  variance  in  the  denominator  of  (2.  3)  by  the  square  of  the 
estimate  of  the  mean  (the  mean  and  the  standard  deviation  are  equal  for  the 
exponential  distribution).  The  other  modification  is  due  to  Ogawara, 

Moran  (1967b,  1970)  gives  the  first  two  moments  of  these  test  statistics 
and  shows  that  asymptotically  they  involve  no  loss  of  efficiency  against 
the  first-order  Markov  interval  process.  However,  Moran's  conjec¬ 
ture  that  the  distribution  of  Ogawara  statistic  converges  rapidly  to  the 
normal  is  not  correct  (Goodman  and  Lewis,  1972);  in  fact,  its  distri- 
bution  is  very  similar  to  that  of  in  the  null  exponential  case. 

An  interesting  application  of  serial  correlation  statistics  in 
examining  neurophysiological  models  is  given  by  Walloe  et  al  (l969). 

There  is  very  evident  negative  correlation  between  successive  inter¬ 
vals;  given  the  known  structure  of  the  neuronal  process,  first  order 
Markov  dependence  would  have  been  expected  if  the  input  to  the 
neuron  had  been  a  Poisson  process,  and  this  was  tested  by  checking 
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if  (pj)  ~  Unfortvmately,  {peraonal  communication)  waa  esti- 
mated  by  kveraging  together  eatimatea  obtained  from  successive  sections 
of  length  ?0,  and  the  bias  (2.  4)  accounts  for  a  large  part  of  the  observation 
tlWit  for  several  experiments  (p  was  systematically  significantly  larger 
than  p  .  A  jackknifed  estimate  (Ouenouille,  1956)  could  have  been  used, 

Cl 

or  a  correctlt>n  for  bias  introduced.  There  probably  still  is  real  higher 
order  depends  ace  in  this  data  accounted  for  by  the  fact  that  the  input  to  the 
neuron  is  a  superposition  of  a  finite  number  of  inputs  and  therefore  not  quite 
a  Poisson  process, 

2.2  Product  moment  score  statistics 

An  alternative  to  the  serial  correlation  coefficient  is  obtained 
(Cox  and  Lewis,  1966,  pp.  166-7)  by  replacing  the  actual  interval 
values  by  their  ranks  r^  or  exponential  scores  e(r^;n)  (Cox 
and  Lewis,  1966,  pp.  26-27),  and  computing  a  first  order  product 
moment  statistic 

n-i 

R  (n)  =  S  e(r  ;n)X  e(r  ;n),  (2.9) 

1  1  1+1 

A  test  for  serial  independence  is  then  based  on  the  null  distribution  of 
R^(n)  under  a  permutation  hypothesis. 

'  An  advantage  to  using  Rj(n)  is  that  it  controls  outliers  in  the 

\ 

series  of  evehts;  i.  e.  ,  missing  points.  Its  distribution  has  been  tabu- 
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lated  by  Lewis  and  Goodman  (L969,  197  0)  for  ranks,  exponential  scores, 
scores  from  gamma  populations  with  parameter  k  *  -  1/2  and  scores 
from  Weibull  populations  with  shape  parameter  1/2  (1/2  Weibull). 

I 

The  distributions  for  these  product  moment  rank  statistics  are 


the  equivalent  parent  population;  positively  skewed  and  very  slow  to 
converge  to  the  asymptotic  normal  distribution.  The  distribution  of  the 
normalized  exponential  score  product -morhent  statistic  does  in  fact 
converge  to  the  distribution  of  when  the  series  length  is  n  «  10,000; 
quantiles  are  shown  for  n  =  450  in  Table  5  and  should  be  compared 
with  the  exponential  values  in  Table  4.  ' 


Table  5. 

Normalized  exponential  score 

product' 

-moment 

statistic;  n 

=  450 

/%/ 

*0,  001 

*0.  002 

*0.  005 

*0.  010 

*0.  020 

*0.  025 

*0.  050 

*0.  100 

-2.799 

-2.623 

-2.370 

-2.  159 

-1.  929 

-1.  848 

-1.  567 

-1.240 

*0.  900 

’^O.  950 

*0.  975 

^0.  980 

*0.  990 

*0.  995 

*0.  998 

*0.999 

1.  270 

1.656 

1.  994 

2.  096 

2.  395 

2.669 

3.  009 

3.245 

The  sampling  error  in  the  values  in  Table  5  is  approximately  0.  001. 

1  I 

Note  that  the  idea  behind  these  tests  is  similar  to  that  for  the 
technique  of  random  shuffling  described  in  Perkel  et  al  (1967a)  and 
apparently  commonly  used  in  neurophysiological  work.^  The  use  of  the 


product -moment  statiatlc  ii  more  sophisticated  and  does  not  require  a 
computer  to  do  the  shxif fling. 


2.  3  Tests  of  serial  independence  based  on  the  periodogram 

The  first  aerial  correlation  coefficient  can  be  used  as  a  test 
for  serial  independence  for  alternatives  other  than  the  first  order 
Markov  interval  process,  but  has  certain  drawbacks.  In  particular, 
for  the  more  common  alternatives  such  as  clustering  processes,  there 
is  no  imperative  for  looking  at  just  the  first  serial  correlation  and  tests 
combining  aerial  correlations  of  several  orders  present  difficulties  be¬ 
cause  of  the  correlation  between  these  statistics. 

Tests  based  on  the  periodogram  were  advocated  by  Bartlett  (1954) 
and  described  in  Cox  and  Lewis  (1966,  pp  168-17C)and  Durbin  (1969). 

Denote  the  finite  Fourier  transform  of  the  n  intervals  x,,  .  .  .  ,  x  by 

1  n 


H  (u)  )  = 


(Zirn) 


172 


n  i»u 

S  X  e  ^ 

s=l  ■ 


and  the  periodogram  by 


(j  =  Zirp/n, 
P 


p  X  1,  2,  .  .  ,  [  1/2  n]  =  I.. 


yV'  '”n‘V 


The  test  is  essentially  for  a  "flat"  spectrum  using  the  distribution 


TABLE  6.  Maximum  Value  of  the  Periodogram 

Estimated  quantiles,  means,  standard  deviations  and  coefficients  of  skewness  and  kurtosis  of  the  maximum  value  of  the 
periodograms  of  series  of  N  =  450  independent  exponential  and  l/2-Weibull  variates.  Averaged  estimates  from  6  synthetic 
samples  each  with  750,000  trials.  Quantties  in  brackets  are  estimates  of  the  standard  deviation  of  the  estimates  (5  degrees 
of  freedom).  Also  exact  values  for  normally  distributed  time  series  for  comparison. 
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theory  for  the  I  (w  )'8.  The  theory  states  that  the  I  (w  )'s  are 
n  p  P 

approximately  independent,  exponentially  distributed  random  variables, 
the  result  being  exact  for  x^,  .  .  .  ,  independent  and  identically  normally 
di  sf  ributed. 

This  hypothesis  of  a  "flat"  spectrum  can  be  tested  using  a  modi¬ 
fied  h  omogeneity  of  variance  statistic,  or  the  cumulated  periodogram 
values 


I  t 

U  "  E  1  (w  )  /  r  1  (w  )  p  *  1.  3 _ /-I 

I  n  p  n  p 

can  b.  'ssisd  as  order  statistics  from  a  uJtlform  distribution.  Bartlett 
shewed  that  the  distribution  theory  is  essentially  independent  of 
the  third  cumulant  of  the  Intervals  between  events,  b.it  it  can  be  shown 
to  be  sensitive  to  k^,  the  fourth  cumulant. 

Again,  re  suits  have  been  obtained  by  synthetic  sampling  by 
Goodman  and  Lewis  (1972).  The dist ribution  of  the  maxlmuin  values 
of  the  periodogram  is  shown  in  Table  6,  the  values  for  normally  dis¬ 
tributed  x^'s  being  exact,  those  for  exponential  and  half  Weibull  being 
taken  from  a  large  simulation.  The  deviations  from  the  normal  case 
for  exponential  x^'s  is  small;  for  1/2  Weibull  x^'s,  which  have  a 
coefficient  of  kurtosis  of  84.  ^20,  the  departure  is  dramatic. 

In  Figure  5,  for  exponential  varieties,  we  give  a  computer  plot 


of  the  distribution  of  the  modified  Kolmogorov-Smirnov  statistic 


KSTWO(N)  =  ^y^l  C  =  max  |U  -  i/i  | 


in  the  form  of  computer  printed  plots  of  sixteen  quantiles  as  a  function 
of  n,  the  length  of  the  series,  (n  >  N  in  the  plot  and  the  statistic  ij 
called  KSCTWO(N))<  The  quantiles  are  a  little  smaller  than  for  the 
normal  case  in  which  the  upper  (asymptotic)  5  percent  and  1  percent 
points  are  1.  35  8  and  1.  628,  respectively.  Note  that  for  N  =  140  the 
statistic  is  based  on  I  =  69  U^^^'s,  so  that  the  convergence  of  the 
quantiles  is  faster  than  it  appears  to  be  in  the  figure,  although  not  as 
fast  as  in  the  normal  case.  Of  course,  it  is  not  known  that  the  dis¬ 
tribution  based  on  the  spectrum  for  non-normal  variates  converges  to 
the  distribution  for  normal  variates,  but  it  is  likely  if  the  first  four 
moments  exist. 
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3.  TESTS  FOR  POISSON  PROCESSES 

We  will  diocuss  here  primarily  tests  for  the  Poisson  hypothesis 

against  stationary  alternatives,  trends  being  discussed  in  Section  5.  We 

assume  as  before  that  n  events  are  observed  at  times  t  <  t,  <  .  .  .  <  t 

12  n 

in  the  fixed  interval  (0,  t^] .  Intervals  between  events  are  denoted  by 

X  ,  X,,  ....  X  ,  and  it  is  convenient  to  denote  the  residual  interval  at  the 
I  c  n 

end  of  the  period  of  observation  by  x  *  t„  -  t  . 

’  n+1  0  n 

There  are  four  major  categories  of  tests.  ' 


a)  Tests  based  on  the  t^'s,  conditional  upon  n  *  ^^q)i  which  is  a 
sufficient  statistic  for  the  nuisance  parameter  the  rate  in  a  homo¬ 

geneous  Poisson  process.  Since  the  t^'s  are  (conditionally)  the  order 
statistics  from  a  uniform  distribution,  the  empirical  count  function  N(t) 
(see  Figure  1)  is  proportional  to  the  empirical  cumulative  distribution 
function  for  the  uniformly  distributed  samples.  The  intervals  x^  are 
than  just  the  gap  statistics  for  the  sample. 

Tests  based  on  the  maximum  deviation  between  N(t)  and  t/t^ 
(Kolmogorov-Smirnov  statistics  and  modifications)  or  other  metrics 
(Anderson-Darling  statistic)  have  well  known  distribution  theories  (see, 
for  example,  Durbin,  1967),  but  are  sensitive  mainly  to  trend  departures 
from  the  homogeneous  Poisson  hypothesis.  In  fact,  Lewis  (1965)  showed 
for  the  special  case  of  gamma  renewal  alternatives,  that  the  test  is  not 
consistent.  Those  results  can  probably  be  extended  by  results  on 
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empirical  proceatee  for  renewal  procesaea  reported  in  Pyke  (1972). 

The  moat  important  thing  to  note  he‘re  ia  that  while  the  diatri- 
bution  theory  of  these  teata  ia  well  known  becauite  of  the  identity  with 
the  problem  of  testing, in  a  random  sample  of  size  n,  a  distribution 
FQ(t)  against  an  alternative  F^(t),  the  distribution  theory  under 
alternatives  is  completely  different.  In  the  one  case  F^^t^)  are  order 
statistics  from  nonuniform  random  samples;  for  point  process  other 
than  (homogeneous  or  nonhomogeneoua)  Poisson  process,  both  the 
uniformity  and  the  randomness  (independence)  disappear. 

Note  that  the  dispersion  test  for  Poisson  variates  (Cox  and 
Lewis,  1966,  p.  158)  is  equivalent  to  testing  the  uniformity  of  the  t^'s 
with  a  chi-square  goodness  of  fit  statistic  (Lewis,  1965). 

b)  The  gap  statistics  for  the  x^'s,  i  =  1,  .  .  .  ,  (n+l),  which,  under 
the  Poisson  hypothesis,  are  a  random  number  n  of  independent  exponen¬ 
tial  variates,  are  formed  from  the  order  statistics  of  the  x^'s  as 

^ni  ~  ^*(i)  "  *(1-1)^^“  +  2  -  i)  (x^  »  0,  i  «  1,  .  .  .  ,n+l).  (3.  1) 

These  are  again  a  random  number  of  independent  exponentials 
(Cox  and  Lewis,  1966,  p.  26)  with  sum  t^. 


and  the  statistics 
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t?  *  S  D 

^  j=l 


*  (*(1)  +  ■*■•••  +  (n  +  ^  -  “  1,  .  .  .  ,n)  (3.  3) 


are,  conditional  upon  n,  order  itatiatica  from  a  uniform  (0,  1)  dlatrl> 

bution  (aee  Cox  and  Lewia,  1966,  p.  154  for  a  more  complete  derivation; 

the  conditioning  upon  n  and  the  fixed  total  t^  ia  very  aubtle). 

A  great  deal  ia  now  known  about  the  null  diatributiona  of  test 

atatiatica  baaed  on  the  gapa  D  ,,  or  on  the  ordered  x, 'a,  and  the 

ni  i 

reader  ia  referred  to  the  excellent  review  articles  by  Pyke  (1965,  1972). 
Alao,  r.gainat  renewal  alternativea,  a  good  deal  ia  known  about  asymptotic 
and,  in  some  cases,  small  sample  power.  Alternativea  are  generally 
specific  diatributiona  on  those  in  which  the  distribution  ia  limited  by 
specifying  that  the  intervals  have,  say,  distributions  with  monotone  increas¬ 
ing  or  decreasing  failure  rate  (Proachan  and  Pyke,  1967).  Again  see 
Pyke  (1972)  for  a  summary,  and  specifically  Jackson  (1967),  Bickel 
(1969),  and  Bickel  and  Dokaum  (1969). 

Empirically  and  from  a  common  sense  point  of  view  it  is 
quite  clear  that  against  stationary  alternativea  testa  for  Poisson 
processes  based  on  the  t^"a  are  more  useful  than  tests  based  on 
the  This  has  been  observed  in  using  the  SASE  program, 
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in  which  the  uniformity  of  the  t^"s  ie  tested  using  both  one-sided  and 
two-sided  Kolmogorov-Smirnov  statistics  and  the  Cramer-von  Mises 
statistic  (Cox  and  Lewis,  1966,  p.  147).  , 

More  information  about  the  power  of  these  pro>:edures,  especially 
against  cluster  alternatives  would  be  valuable  (Lewis,  1969;  Vere-Jones, 
1970). 

c)  There  are  also  specific  tests  against  renewal  alternatives. 

The  Moran  test  (Cox  and  Lewis,  1966,  p.  161)  is  an  example. 

d)  A  useful  test  can  be  based  on  the  empirical  spectrum  of  counts, 
as  pointed  out  by  Bartlett  (1963).  Thus,  let  the  finite  Fourier -Stielties 
transform  of  the  sample  function  N(t)  be 

t 

H  (u)  *  r  e^^*^  dN(t)  (3.4) 

*0  'n.o 

—  1/  2  ^  ^ 

=  (irtj’  '  {  Z  coe(t,w)  +  i  S  sin(t  u)}  (3.5) 

°  J.1  i  J«1  J 

=  (irt  (w)  +  i  (W)},  (3.6) 

0  0 

and  the  periodogram 


I.  (u)  *  (w)  +  (w)}. 

0  0  *0 
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We  will  refer  again  to  the  diatribution  theor>  of  I  (w)  later. 
Noting,  however,  that  for  wt^  =  pZ-rr ,  {u)  ia  approxin  ’itely 

exponentially  diatributed  with  mean  Xx,  and  that  for  am  two  auch 
frequenciea  the  correlation  between  the  periodogram  pointa  ia  ~l/(l+n), 
we  can  aee  that  the  apectral  teat  for  independence  (Section  2)  can  be 
applied  here  to  teat  for  a  Poisaon  proccaa. 

The  main  drawback  here  ia  that  while  the  apectrum  of  intervala 
ia  limited  to  approximately  n/2  periodogram  pointa,  the  count  apectrum 
ia  not  limited  in  auch  a  way.  Since  there  are  roughly  n/2  degreea  of 
freedom  available,  it  ia  a  problem  aa  to  which  n/2  pointa  of  the 
periodogram  with  frequency  of  the  form  wt^  *  Zirp  to  use.  Using  more 
would  invalidate,  for  example,  use  of  standard  distribution  theory  of 
Kolmogorov-Smirnov  statistics  to  test  tlie  cumulated  periodogram. 

Thia  is  a  point  that  needs  considerably  more  work.  The  tests 
are  still  useful  in  an  informal  way,  particuarly  since  the  shape  of  the 
apectrum  can  suggest  physical  reasons  for  departures  from  a  Poissoi* 
process.  In  neurophysiology  (Pe  rkel  et  al,  1967),  the  tendency  has 
been  to  use  the  estimated  intensity  function  (Cox  and  Lewis,  1966,  p.  121) 
rather  than  the  spectrum  of  counts  to  assess  departures  from  the  Poisson 
hypothesis.  This  is  because  many  of  the  neurophysiological  processes 
causing  departures  are  more  simply  expressed  in  time  than  in  frequency. 
However,  the  distribution  theory  for  the  estimated  intensity  function  is 
difficult.  Cox  (1965)  has  discussed  the  distribution  theory  for  the  Poisson 
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4.  STATISTICAL  ANALYSIS 

OF  NON-RENEWAL  POINT  PROCESSES 

We  consider  here  several  non- renewal  point  processes  which  are 
of  great  practical  importance.  For  the  most  part,  however,  the  struc¬ 
ture  is  too  complicated  to  write  down  a  likelihood  function  so  that 
estimation  and  testing  is  ad  hoc.  Examples  of  such  analyses  are  cited. 

4.  1  Cluster  point  processes 

Cluster  point  processes  (branching  point  processes)  are 
important  because  they  arise  naturally  in  practice  and  have  in¬ 
teresting  mathematical  properties  (Lewis,  1969;  Vere-Joner, 

1970). 

Briefly,  a  main  process  (usually  a  Poisson  process)  gmerates 
at  each  point  a  sequence  of  subsidiary  events.  In  Vere-Jones  (1970)  the 
main  processes  are  initial  earthquake  shocks,  the  subsidiary  or  cluster 
events  are  aftershocks.  The  processes  of  subsidiary  evens  are  inde¬ 
pendent  and  can,  in  general,  be  arbitrary  point  processes  which 
terminate  with  probability  one  after  a  finite  number  of  events  occur. 

Two  important  special  cases  arise: 

a)  The  subsidiary  events  are  generated  cumulatively  as  a  finite 
renewal  case.  This  is  known  as  the  Bartlett -Lewis  process. 


b)  The  subsidiary  events  are  generated  additively,  each  one  of  the 

random  number  of  events  being  independently  displaced  from  its  (main) 
generating  event.  The  resultant  subsidiary  process  is  generally  non¬ 
stationary  and  the  complete  process  is  called  a  Neyman-Scott  cluster 
process. 

Computer  failure  patterns  generated  by  this  type  of  mechanism 
have  been  analyzed  by  Lewis  (196-4),  and  earthquakes  by  Vere-Jones  (1970), 
Vere-Jones  and  Davies  (1966),  and  Shlien  and  Toksbz  (i970).  A  fairly  good 
ad  hoc  analysis  can  be  given  for  the  Bartlett-Lewis  process  since  the 
marginal  distribution  of  intervals  is  known  (Lewis,  1964),  as  well  as  the 
spectrum  of  counts  and  in  some  cases  the  spectrum  of  intervals  (Gilles 
and  Lewis,  1967).  The  Neyman-Scott  process  does  net  yield  a  simple 
expression  for  the  interval  distribution  and  it  is  not  yet  known  if  the 
coefficient  of  variation  of  the  intervals  is  greater  than  one,  as  it  iS 
for  the  Bartlett-Lewis  process. 

These  cluster  processes  are  overdispersed  relative  to  a  Poisson 
process  and  the  variance  time  curve  has  an  asymptotic  form  which  is 
independent  of  the  fine  structure  of  the  subsidiary  processes.  This  is 
a  help  in  analyzing  the  data;  basically  for  large  time  periods  the  counts 
of  events  N(t)  behave  as  though  the  subsidiary  events  were  concentrated 
at  the  main,  generating  event,  i.  e.  ,  like  a  bulk  Poisson  process. 


P«ipite  these  points,  the  nituation  with  regard  to  cluster  processes 

I 

is  unsatisfactory;  questions  such  as  the  discrimination  of  Bartlett -Lewis 
and  Neyman-Scott  processes  arise  in  practice  and  are  not  solved  (see  ' 
the  discussion  in  Vere-Jones  (197  0)). 

When  the  cluster  process  has  clusters  with  only  one  event,  the  pro¬ 
cess  is  equivalent  to  an  infinite  server  queue.  For  Poisson  main  events 
(pa!^rameter  X.)  the  process  of  subsidiary  events  is,  in  equilibrium,  a 
Poisson  process  and  the  delay  distribution  cannot  be  determined. 

An  important  special  case  occurs  when  the  main  process  is  regu¬ 
lar  and  each  event  is  independently  delayed.  Appointment  processes  are 
very  often  of  this  type  and  the  determination  of  the  delay  distribution  from 
observation  of  the  subsidiary  events  (arrivals)  has  been  considered  in 
detail  by  GovieV  and  Lewis  (1967)  for  several  delay  distributions.  Since 
the  variance-time  curve  has  a  finite  limit,  they  have  called  these  pro- 

I 

cesses  with  controlled  variability. 

\ 

4.  2  Superposed  processes 

Statistical  analysis  of  processes  which  aro  superpositions  of 
point  processes  is  important,  particularly  in  neurophysiological  contexts, 
and  usually  hinges  on  questions  of  estimating  the  number  of  contributing 
processes  or,  when  this  is  known,  identifying  the  structure  of  the  com¬ 


ponent  processes. 
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These  are  difficult  questions  whose  solution  has  not  progressed 
much  beyond  the  basic  work  of  Cox  and  Smith  reported  in  Cox  and  Lewis 
(1966,  Chapter  8).  Without  specific  assuniptinns;  i.  e.  ,  that  the  com- 

I 

ponent  processes  are  renewal  processes  with,  say,  gamma  distributed 
intervals,  very  little  can  be  done  to  estimate  the  number  of  processes. 

The  problem  of  Walloe  et  al  (1969)  is  of  this  kind,  the  nature  of  the 
input  processes  and  the  neural  mechanism  being  well  known,  the  main 
question  being  how  many  inputs  impinged  on  the  neuron. 

Identifying  the  component  processes  is  again,  difficult,  without 
specific  assumptions.  Work  such  as  that  of  Ross  (197  0)  on  identifying 
the  interval  distribution  in  a  known  number  k  of  superposed  processes 
is  rather  technical;  note  that  the  variance  time  curve  and  spectra  of 
counts  are  additive;  e.  g.  ,  the  spectrum  of  counts  of  the  superposed 
process  is  k  times  the  spectrum  of  the  individual  processes.  Thus,  since 
the  spectrum  of  counts  of  a  renewal  process  is  simply  related  to  the  Laplace 
transform  of  the  probability  density  of  the  intervals,  ^*(")' 


j  i*(iw)  f*(-l«) 

gjj(u)  0  +  +  1  -f*(-i«)  J 


(4.  1) 


where  p  is  the  mean  of  x,  the  spectrum  of  the  superposed  process 
g^(u))  «  kg^(u)  and  it  is  natural,  and  probably  efficient,  to  use  the 
estimated  spectrum  or  the  estimated  covariance  density  to  estimate 
f  (x).  Convergence  follows  from  results  of  Brillinger  (1972). 
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Nonhomogeneous  flupeipusitions  also  occur  (Blumenthal,  Green¬ 
wood,  and  Herbach  (1971)),  but  as  in  the  homogeneous  case,  when  the 
number  of  contributing  processes  is  large,  the  resultant  process  over 
realistic  periods  of  observation  is  almost  indistinguishable  from  a 
(homogeneous  or  nonhomogeneous)  Poisson  process. 

4.3  Doubly  stochastic  point  processes 

Let  y\(t)  be  a  real  valued,  nondecreasing  stochastic  process. 

A  doubly  stochastic  point  process  is  a  generalization  of  the  nonhomo¬ 
geneous  Poisson  process  in  which  the  integrated  intensity  function  is 
replaced  by  Jit).  Thus,  given  a  realization  of  A(t),  the  point  prc  cess 
is  a  nonhomogeneous  Poisson  process.  Generally  Jit)  is  differentiable 
and  \  (t)  =  A'(t)  might  be  a  stationary  stochastic  process,  a  determin¬ 
istic  trend  or  a  combination  of  both.  The  pro«.eB8  was  introduced  by 
Cox  (1955);  see  Bartlett  (1966,  p.  325),  Cox  and  Lewis  (1966,  p.  179), 
and  Grandell  (1970)  for  details.  Gaver  (1963)  considered  the  case  where 
X  (t)  changes  level  at  random  times  and  called  the  process  a  random 
hazard  process. 

The  doubly  stochastic  mechanism  in  a  point  process  is  very 
realistic  and  probably  quite  common.  Thus  computer  failure  processes 
depend  to  some  extent  on  temperature,  humidity,  etc.  Unfortunately, 
analytic  properties  of  the  process  when  X  (t)  has  a  stochastic  element 
are  very  difficult  to  derive;  see  Lawrence  (1972).  If  X  (t)  is  a  stationary 


stochastic  process  with  mean  variance  o  and  autocorrelation 

\ 


fxinction 


00  -iUT 

p(t)  *  l/?n  /  p(t)  e  dF(u), 
-  00 


(4.  2) 


where  F(u)  in  the  Integrated  spertruni  and  f(u))  =F'(l')  when  it  exists 
then  several  useful  results  can  be  obtained  (Cox  and  Lewis,  1966, 
p.  179-183).  In  particular,  the  covariance  density  y^lt)  of  the  doubly 
stochastic  Poisson  process  is 


(4.  3) 


and  the  spectrum  of  counts 


g^(w)  =  X/tt  +  2f(w). 


(4.  4) 


Also,  in  general,  the  index  of  dispersion  is 


I(oo)  =  1  +  2  ^  /  p{u)  du, 

X.  0 


(4.  5) 


so  that  the  process  is  overdispersed  relative  to  the  Poisson  process. 

There  has  been  a  great  deal  of  interest  in  the  doubly  stochastic 
Poisson  process  in  physics,  optics,  and  engineering,  and  some  work 


on  the  eitimation  problems. 


a)  In  physics  phot'''  emission  is  a  well  understood  physical  process, 

and  it  is  known  that  the  process  modulating  the  Poisson  emission  is,  for 
instance,  an  Ornstein-Uhlenbeck  process.  The  parameters  of  this 
process  are  physical  parameters  which  it  is  of  interest  to  estimate, 
and  attempts  have  been  made  (e.  g.  ,  Pusey,  1971,  Jakeman,  Pyks  and 
Skvain,  1971,  and  Koppel,  1971)  to  do  this  via  the  spectrum  and  the 
relationship  (4.  4).  Again,  laser  light  is  deliberately  focused  on  a  physical 
system  and  the  resultant  intensity  fluctuations  in  the  scattered  light  re¬ 
flects  rates  of  molecular  motions  and  interactions  in  the  system. 

In  most  cases  the  number  of  photon  counts  in  these  experiments 
is  very  large  and  it  is  convenient,  and  very  often  necessary  to  cumulate 
counts.  There  are  problems  of  determining  the  best  sampling  interval, 
a  problem  which  is  often  made  simpler  by  detailed  knowledge  of  the 
modulating  process. 

Perhaps  because  of  the  large  amount  of  data  involved  and  the 
fact  that  the  X.  (t)  process  is  fairly  well  understood,  physicists  rarely 
bother  about  details  such  as  the  efficiency  or  optimality  of  estimation 
procedures.  Moreover,  the  spectral  estimation  is  very  often  automated 


and  done  digitally. 
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b)  In  engineering,  two  areai  are  involved.  One  is  biomedical 
engineering  in  whici\  for  example,  radioactive  substances  are  injected 
into  the  blood  stream  and  the  counts  of  the  radioactive  emission  are  used 

I 

to  estimate  a  decay  function  which  is  related  to  a  physical  process  of 
interest  (Snyder,  1971). 

A  second  area  is  optics,  where  modulated  light  is  used  to  trans¬ 
mit  Information  (Reiffen  and  Scherman,  1963;  Bar-David,  1969;  Karp 
and  Clark,  1970;  Clark  and  Hoversten,  1970).  Here  X.  (t)  is  very  often 
a  signal  changing  levels,  a  possible  problem  being  the  discrimination 
of  several  X.  ^(t)'s  from  photon  covints  of  the  noise.  Very  often  a 
stationary  "noise"  element  is  also  present;  see  Bedard  (1966)  for  the 
physical  considerations. 

In  all  of  this  engineering  literature,  there  is  much  concern 
with  optimality,  in  some  sense,  and  procedures  are  based  on  likelihood 
ratios,  Bayesian  posterior  statistics  and  maximum  likelihood  detectors. 

It  is  a  difficult  literature  to  penetrate  and  my  overall  impression  is  that 
there  are  many  hidden  assumptions  involved,  one  being  that  samples 
are  large,  the  other  a  normality  assumption  which  may  be  quite  incorrect. 
Moreover,  most  explicit  results  are  for  very  simple  situations  such  as 
mixed  Poisson  processes  of  one  type  or  another,  or  rates  (Reiffen  and  Sherman, 
1963;  Snyder,  1972)  changing  at  known  times.  The  latter  problem  is  very  simple 
and  straightforward,  especially  when  compared  to  the  case  of  unknown  change 
points  which  gives  a  true  doubly  stochastic  Poisson  process  and  an 
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inference  problem  similar  to  the  difficult  change  point  problem. 

c)  Grandell  (1971,  1972)  has  considered  inference  in  doubly  stochastic 
processes  from  the  viewpoint  of  mathematical  statistics,  his  motivation 
apparently  being  primarily  problems  in  actuarial  work.  His  approach  is 
quite  different  from  that  in  the  engineering  literature,  cited  above,  and 
he  has  considered,  for  example,  optimum  estimates  of  £{X.(t))  based 
on  linear  combinations  <f  the  data.  The  problems  are  related  to  the 
general  problem  of  curve  estimation  fronri  random  data  and,  while 
Grandell  (1972)  gives  specific  examples,  much  remains  to  be  done. 
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5.  TREND  ANALYSIS 

IN  NON-HOMOGENEOUS  POISSON  PROCESSES 

V>'e  dlflcuii  now  th«  analyiis  of  tranda  In  non-homoganaoua  Pcl:«on 
procasaaS(  axtanding  tha  raittlta  of  Cox  and  Lawla  (1966,  Ch.  3).  Initlallyi 
wa  diacuaa  raaulta  baaad  on  apaciflc  paramatric  modala  for  tha  rata  function 
X(t)  of  tha  non-homoganaoua  Polaaon  procaaa.  Thaae  raaulta  ara  baaad  on 
tha  fact  that  tha  likallhood  function  for  n  obaarvationa  in  tha  fixad  pariod 
(0,  tp]  at  timaa  - ^ 

L(tj, .  .  .  ,t^,  n;p)  =ini  X(t^;  0)  axp  {  - X(u;e)  du},  (5.1) 

whara  ^  danotaa  tha  v-actor  of  paramatara  in  tha  model.  Moraovar,  givan 
that  n  avanta  occur  in  (0,tQ],  tha  timaa  to  avanta  t^  ara  tha  ordar  atatia- 
tica  from  a  random  aampla  from  tha  probability  danaity  function 


f(t;0) 


X(t;^ 

Jq^  OX  (u;§)du 


X(t;Q)  ^ 

^V2)  ’ 


(5.2) 


and  tha  conditional  likallhood  ia 

L(tj . 

Latar  in  tha  aactlon  wa  diacuaa  procaduraa  for  axamining  the  adequacy 
of  t)ta  modal  for  \.\t)  and  adequacy  of  tha  non-homoganaoua  Polaaon  procaaa 
modal  itaalf. 

5.  1  Monotona  and  evolutionary  tranda 

Tha  aatlmata  of  X(t)  whan  there  ia  no  trend  praaant  i.  a.  X(t)  =  X,  la 

n/t,. 


In  Cox  and  Lawlc  (1 966^ Chapt* r  3)  tha  axponantlal  llnaar  trand 
X(t)  =  axp  t)  =  X.#xp{#  t} 


(5.4) 


was  dlscusaad  and  It  was  shown,  using  (5.  1),  that  (n,£t^)  wars  a  sat  of 
sufflciant  staHatlca  for  («,^).  A  tast  for  fi -0  against  fiPO,  whsra  a  Is 
a  nulsanca  paramatar,  la  basad  on  tha  distribution  of  £t^/n,  glvan  n.  This 
Is  an  optimum  (conditional)  tost.  Tha  maximum  likalihood  aatimata  of  fi 
was  also  givan  implicitly,  but  no  small  sampla  props rtlos  of  the  sstimata 
ara  known. 

Tha  tast  for  fi  =  0  was  appllad  In  Saction  1  to  analyza  a  aaquanco  of  ar¬ 
rivals  at  an  intansiva  cars  unit  (Tabla  1)  and  a  sectionad  analysis  indicated 
that  tha  trend  was  not  monotone,  although  same  overall  increase  In  tha  rata 
was  present. 


An  exponential  quadratic  trend  uses  the  modal 


X(t)  =exp  {a  +^t  +  yt^}*  X.exp{^  t  +  'Yt^}, 


for  which  we  get 


lnL(«,^,  y)  =  n  In  K  +y 


(5.5) 


(5.6) 


^0  2 
-  X  /q  exp  {fin  +  yu  )  du. 

The  maximum  llkellliood  estimates  of  X,  ara  givan  as  tha  solution  of  tha 


aquations 


^  *  n/  {/J^axp  (^u+^^)du}. 


(5.7) 


Ztj  /q°u  axp($u  +yu^)du 

n  *0  ^  ^  2 

[ ^•xpifin+yn  )du 


(5.8) 
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#02  -  -  2,  ^ 

£t|  u  axpl^u^vu  )du 

—  -  - - - -  -  0.  (»•’) 

/q«*P(^u+^u  )du 

2 

and  It  la  claar  that  ('j,  £t^,  I^t^  )  ara  a  aat  of  au/flclant  atatlatica  for 

(•.  fi,  yl 

Thara  ara  aavaral  Intaraatlng  opan  quaatlona  hara. 

What  ara  tha  amall  aampla  propartiaa  of  tha  aatlmatora  and  what  ara 
their  large  aampla  propartiaa?  Note  that  tha  uaual  theory  for  maximum 
likelihood  aatlmataa  la  not  directly  applicable  bacauaa  of  tha  random  aampla 
alza. 

l\1iat  affect  doaa  tha  quadratic  term  (1.  a.  v/0)  have  on  tha  aatlmataa 
of  fi  In  tha  modal  (S.  4)  ? 

For  tha  arrival  data  tha  following  aatlmataa  ware  obtained. 

Linear  modat  X  -  0.6342]  ^  a  0. 00014ZZ|  ^  . 

Quadratic  modal:  X»  0.4792]  fi  =  0.  0009788:  v=-0.  4481*  1(J 

A 

Not*  tha  large  dlffaranc*  In  fi  In  the  two  caaaa. 

Another  problem  of  Inter* at  la  to  teat  for  tha  quadratic  term  in  the 

trend.  It  la  claar  from  Table  1  that  either  a  quadratic  team  or  a  long  cycle 

la  nacaaaary  to  adequately  modal  the  annual  data  from  Section  1. 

A  formal  teat  can  be  baaed  on  tha  Idea  that,  for  any  glvan  >,  n  and  £t. 

1 

ara  a  aet  of  aufflclant  atatlatica  for  a,fi.  Tharafor*  a  teat  for  y  la  baaed 
on  the  atatlatlc  £t^  and  Ita  conditional  dlatrlbutlon,  glvan  n  and  Ft^.  Thla 
dlatrlbutlon  la  difficult  to  obtain  analytically,  Zt^  balng  tha  aquar*  of  the 
dlatanc*  to  tha  aampla  point,  which  la  conatralnod  to  11*  In  tha  (n-I)>dlman- 
alonal  hyparplan*  defined  by  Zt^  a  C. 
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For  Urga  aamplaa  this  diatrlbutlnn  can  ba  ohtainad  from  tha  fact 
that  Zt^/n  and  Zt^/n,  conditlonad  on  n«  ara  Jointly  normally  dlatrltutad 
for  larga  n,  and  tha  following  axact  momant  reaulta: 

t 

i^l  =  E{  Ztj/n)  =  Iq/Z;  <rj  =  var  {  E  t^/n)  =  (UnfS 

11^=  E{ZtJ/n}=  tJ/S;  o  ^  =  var  {  Zt^/n)  ■  41^  (45nr^ 

/  ^^4  ^^i  \  nM  5  -  OAfl 

p=  corr{— — *  }  «  -  0.9o8. 

^  n  n  4 

2 

Thus,  using  normal  thaory  raaults  we  teat  with  Zt^  /n  having  a  normal 
dlatrlbution  with  maan  and  deviation 

'‘^’‘2*'^  T  * 


..2  M  .  ^  /2t, 

n 

tr  =<rj  (1-p^)^. 


_o  ). 
2 


(5.10) 


(5.11) 


For  tha  complata  sat  of  arrival  data  Zt^/n  =  954.  25,  giving  ji  =  1,  187,783, 
and  (T  =  6,  530,  E  t^Ai  =  1,  161,  565,  giving  (  Et^ /n- ji)/(r  =-4.015  and  this 
la  cloarly  a  highly  significant  rajacHon  of  tha  hypothesis  that  \  =  0. 

Boswell  (1956)  and  Boswell  and  Brunk  (1969)  have  considered  tests 
of  the  hypothesis  that  X(t)  is  not  constant  but  is  non-dacreasing.  Using  a 
likelihood  ratio  criterion,  conditional  on  the  fixed  value  of  N(tQ)*n,  he 


found  tha  test  statistic 

“  -  -1 

<ai 


(5.12) 


wha  ra 
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tha  null  hypothasli  being  rejected  for  large  values  of  the  statletlc. 

The  statistic  (S.  1 2)  Is  not  simple  to  compute,  but  Boswell  gave  an  ' 
Iterative  procedure  for  the  computation,  and  same  results  on  the  limiting 
distribution  of  the  statistic. 

It  would  be  of  Interest  to  compare  the  power  of  this  test  against  the 
power  of  the  test  based  on  Bt^ /n  for  the  exponential  linear  trend.  Some 
results  have  been  obtained  by  Mr.  Ian  White  of  the  University  of  Edin¬ 
burgh. 

5.  2  Cyclic  trends  of  fixed  frequency 

Cyclic  time  trends  (as  opposed  to  cycles  on  serial  number)  occur 
frequently  In  point  processes  but  were  treated  only  as  an  exercise  In  Cox  Ic 
Lewis  (1966).  An  example  of  such  a  series  Is  given  in  Forrest  (1950),  who 
was  Investigating  thunderstorm  severity  In  Great  Britain  and  its  effect  on 
power  lines.  He  found  a  tendency  for  thunderstorms  to  occur  in  the  morning 
(time  of  day  effect)  and  of  course  a  very  strong  time  of  year  (scasonal)effect. 

The  arrival  data  discussed  in  Section  1  might  be  expected  to  have 
such  effects,  although  a  time  of  day  effect  is  by  no  means  evident  since  there 
is  only  about  one  arrival  every  day  and  a  half.  In  Figure  6  we  show  a 
collapsed  plot  of  the  numbers  of  arrivals  in  the  successive  hourly  periods 
of  all  days  of  observation.  The  plot  should  be  compared  to  Figure  1. 8a  in 
Cox  It  Lewis  (I966)  for  the  arrivals  from  4  February  1963  to  18  March  1963. 
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Figure  6.  Arrival  of  patient*  at  an  intensive  care  unit. 

Collapsed  hourly  plot  of  number  of  arrival*  to  investigate  "time-of-day"  effect. 
Second  period  of  observation:  19  March  1964  to  8  Febr\iary  1968,  Total  number 
of  arrival*  (including  ties);  1216.  Observation  time  1420  days;  m  «  0.  856. 
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Figure  7.  Arrival  of  patients  at  an  intensive  care  unit. 

Collapsed  daily  plot  or  arrivals.  All  arrivals  including  ties.  Solid  line 
is  2nd  period,  19  March  1964  to  8  February  1968  (1216  arrivals  in  1420 
days).  Dashed  line  is  complete  record.  4  February  1963  to  8  February 
1968  (1467  in  1829  days).  ,i  (n,  -  n)^/(?r)  =  1 2,  93,  Note  th..t 
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The  plot  for  the  complete  period  (4February  1963-6  February  1968)  In 
Figure  6  is  ■imllar  in  shape  to  the  plot  for  the  earlier  period  and  no  formal 
atatletlcal  teat  la  needed  to  conclude  that  there  is  a  real  tlma-of-day  effec^. 

As  a  model  for  the  rate  \(t)  in  a  non-homogeneous  Poisson  process 
with  fixed  frequency  (jQ=2ir/TQ  we  take 


\.(t)  =  exp  {«+k^  sin  +  k^  cos  u^t} 


*  X.  exp  {k  sin  (10^1+6)}  , 


(5.14) 

(5.15) 


io(k) 


1  k 
-1  ,  c 


2  2  4'  —  I  - 

where  k={k^+k^r,  9  =  tan’‘  (  ^  )  ,  X  =  exp(®;xlQ(k), 


and  lQ(k)  is  a  modified  Bessel  function  of  the  first  kind  of  zero  order.  This 
reparameterization  is  used  because 


exp  {k  sin  ( -^^  +  0  )  }  dt  =  ’ 


(5.  16) 


so  that  if  t^,  the  total  period  of  observation  is  a  multiple  p  of  (here  one 
day),  we  have 


A(to)=  /q  X(u)du  =  X  t^  »  XpT^. 


(5.  17) 


In  effect  we  have  separated  out  multi p lie atively  a  linear  growth  from  a 
cyclic  effect  which  takes  place  only  within  periods  of  length  T^. 

The  reason  for  using  the  form  5.  14  rather  than,  say. 


X(t)  *  a  +  k  iin  (cj^t  +0 ) 


(5.18) 


is  that  X(t)  must  be  positive,  and  to  achieve  this  with  (5.  18)  requires  k<a. 


1 


si 


For  k  «  O'  the  two  modela  are  essentially  the  same.  In 

addition,  the  rate  (5.  14)  gives  simple  results  based  on  sufficient 

« 

statistics,  this  being  closely  connected  with  the  fact  that  the  density 
(5. 2)  belongs  to  the  exponential  family  of  density  functions. 

An  additional  reason  for  using  (5.  14)  is  that  such  nonsymmetrical 
cycles  are  probably  more  natural  with  series  of  events,  possibly  because 

of  the  positivity  of  the  rate  fvmction.  Professor  J.  W.  Tukey  has  suggested 

\ 

a ’model  which  is  (5.  18)  squared;  this  could  be  useful  with  data  in  which  there 
are  a  large  number  of  arrivals  in  each  period,  so  that  regression  techniques 
using  the  square  root  of  the  numbers  of  arrivals  in  fixed  intervals  can  be 
used  (Cox  and  Lewis,  1966,  Chapt.  3).  This  model  has  also  been  used  by 
Fisher  (  1953). 

In  Lewis  (197  0)  the  following  results  were  given  for  the  non-homo- 
geneous  Poisson  process  with  the  rate  \  (t)  given  in  (5.  14). 

Using  (5.  3)  we  find  that  the  observations  enter  into  the  likelihood 
function  only  through 

n  n 

n,  A  (u^)  =  T,  cos  (i)„t.,  B  (w  )  »  Z!  sin  w  t.,  (5.  19) 

‘o  0  i-1  ‘o  ° 

and  these  are  a  set  of  sufficient  statistics  for  the  parameters  a,  k^ 
and  k  in  (5.  14).  Note  that  A  (cj  )  and  B  (w  )  are  the  components 
of  the  periodogram  (3.6). 


Maximum  likelihood  estimates  of  X-  ,  9,  k  are 
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1 


(5.  20) 


(5.  21) 


and  k  ia  the  aolution  of  the  equation 


(l/n)fA^  (o)  )  +  sf  (0,  )) 
0  0 


Ij(k) 

ylo 


•Wk). 


(5.22) 


where  Ij(k)  is  the  derivative  of  lQ(k)l  and  4<(k)  increases  monotonically 
from  0  to  1  as  k  goes  from  0  to  go.  This  later  fact  allows  one  to 
use  the  Neyman-Pearion  lemma  to  show  that,  conditional  on  the  observed 
value  n  of  1^(1^),  the  most  powerful  test  of  k  =  0  against  k  0  is 
based  on  the  statistic 


''f '“o'  '“o’ 


'"'o/"’ 


(5.  23) 


Since,  or  k  =  0,  21^  (u^)  has  asymptotically  (Cox  and  Lewis, 

1^66,  Chapt*-r  5)  an  exponential  distribution,  the  test  for  k=0  reduces 
to  accepting  the  hypothesis  at,  say,  a  5%  level  if 


(“q)  *  (tQ/n)=  (S/Tr)X  m. 


(5.24) 


The  factor  6  arises  because  prob  f)(  *  6}  =  .  95,  where  x  is  a 
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random  variable  with  a  chi-aquare  distribution  of  two  degrees  of  freedom. 

This  test  applied  in  Lewis  (197  0)  to  the  first  section  of  the  arrival 
data  gave  a  strong  indication  of  the  presence  of  the  cycle  at  a  period  of 
one  day. 

For  the  complete  record  of  arrivals  at  an  intensive  care  unit  dis¬ 
cussed  in  Section  1  we  get  for  the  periodogram  at  p  =  1829  or 
which  is  the  day  frequency,  since  t^  =  1829  and  =  2iTp/tQ^ 

21^  (cjg)  =  27.  094  and  Sm/ir  =  1.  52. 

This  is,  not  surprisingly,  highly  significant.  For  the  first  409 
arrivals  the  results  are  (Lewis,  1970) 

21  (u  )  =  5.  0  and  bm/ir  *  1.  10. 

Thus  the  periodogram  component  is  increasing  roughly  in  proportion  to 
n,  as  it  should  for  a  true  cyclic  component. 

When  Lewis  (197  0)  was  written,  the  connection  of  this  model  for 
a  cycle  of  fixed  frequency  with  tests  for  directionality  on  the  circle  was 
not  realized.  In  fact,  the  conditional  test  (5.24)  is  equivalent  to  testing 
that  n  observations  on  a  circle  (here  a  24-hour  clock)  have  the  von  Mises 
or  circular  normal  distribution  (Gumbel,  Greenwood  and  Durand,  1953) 
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f{i)  = 


exp{k  Bin(i  +  9)} 

2  IT  Iplk) 


(0^1^  Zn).  (5.  25) 


For  k  =  0,  this  is  a  uniform  dlitribution;  otherwise,  it  has  a  mode  at 
9,  the  vector  in  the  direction  0  being  called  the  modal  vector.  Green¬ 
wood  and  Durand  (1955)  obtained  the  distribution  of  the  square  of  the 
resultant 


R 


2 


=  ( 


n  2 

S  cos  f .) 
i=l  1 


+ 


(  S  sin  f .) 
i=l  1 


(5.26) 


of  n  such  vectors  when  k  *  0,  generalizing  earlier  work  of  Pearson 
on  the  problem  of  random  walks  on  the  circle.  The  formal  analogy  of 
(5.  26)  with  the  periodogram  (3.  5)  should  be  clear.  Watson  and 
Williams  (1956)  generalized  these  distributional  results,  using  results 
for  sufficient  statistics,  and  found  the  conditional  p.  d.  f.  of  the  quantity 
on  the  left  of  equation  (5.  22),  which  we  denote  by  r,  as 


r)  =  r  Ig(kr) 


(Jq(u)}”  jQ(ru)du/  flp(k)}'^. 


(5.  27) 


where  Jq(  •  )  is  the  ordinary  Bessel  function  of  zero  order. 

It  is  not  all  apparent  that  (5.  27)  is  more  useful  for  computation 
than  the  generating  function,  which  is  given,  for  example,  for  k  =  0 
in  Cox  and  Lewis  (1966,  Chapter  5)  in  the  discussion  of  the  distribution 
of  the  periodogram  at  one  point  where  is  of  the  form 


1 
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=  ZiTp,  fcr  p  integer.  However,  Stephens  (1969a,  1969b)  has 
used  (5.  27)  to  tabulate,  among  other  things,  the  power  of  the  test  that 
k  -  0  against  alternatives  k  >  0.  The  most  complete  discussion  of  these  tests 
for  the  von-Mises  distribution  is  given  In  Watson  and  Williams  (l956).  The 


function  ‘i^(k)  is  tabulated  by  Gumbel,  Greenwood,  and  Durand  (1953).  Stephens 
(1969b)  has  also  discussed  tests  for  0  =  9^,  and  joint  tests  for  k  and  0, 

It  is  clear  that  problems  involving  cycles  at  two  or  three  fixed 
frequences,  e.  g.  ,  tj^,  will  arise  in  analyzing  series  of  events. 

For  example,  in  the  proolem  of  analyzing  the  arrivals  at  an  intensive 
care  unit,  a  time  of  week  effect  and  a  time  of  year  effect  are  distinct 
possibilities.  Surprisingly  enough,  there  does  not  seem  to  be  a  strong 
time  of  week  effect  in  the  data,  but  there  is  not  space  here  to  go  into 


this. 


Formal  teste  for  more  than  one  cycle  follow  from  results  in 


Cox  and  Lewis  (1966,  Chapter  5)  to  the  effect  that  at  two  different 
frequencies  and  both  of  which  wiien  multiplied  by  t^  are  2tt 

times  an  integer,  the  correlation  between  the  perlodogram  values 


I  (Wq)  and  I  (Wj)  is  approximately  (1  +  n)  Thus,  a  test  for  a 
time  of  week  effect  (w^),  based  on  the  conditional  distribution  of  I((*)j), 
given  the  values  of  reduces  in  effect  to  an  inde¬ 

pendent  test  at  Uj  based  on  (5.  23). 

For  the  arrival  data,  considered  over  a  period  of  1456  days, 

we  get  for  p  *  208  (w  ■  Ztt/T),  or  a  period  of  one  week,  the  value 

P 


96 


A  (w  )  (u)  ) 

2  ‘o  P  ‘o  P 

0  0  0 

Thus  there  ii  no  formal  indication  of  a  time-of-week  effect. 

5.  3  The  ipectrum  of  counte  and  cyclea  of  unknown  frequency 

In  the  prevdoua  aubsection,  we  ahowed  the  connection  between 
teati  for  a  cyclic  component  at  a  known  frequency  in  a  nonhnmogeneoua 
Puia aon  procea a  and  the  perlodo({rani ,  the  pcriodo(;ram  beln|{  the  baaia 
for  eatimation  of  the  (aecond  order)  apectrum  of  counta  K^(u). 

Clearly,  one  might  want  to  look  (or  cycles  at  unknown  frequency  and 
thia  will,  intuitively,  be  baaed  on  the  apectriun  of  counta.  More  pre¬ 
cisely,  it  will  be  baaed  on  the  periodogram 

I  (w)  ■  (1/  TTt^  )  f  (w)  +  (w)  }  .  (5.  2  8) 

'o  'o  'o 

The  analogous  problem  in  ordinary  time-series  analysis  is  the 
classical  problem  of  hidden  periodicities,  diacussed  at  length  by  Hannan 
(1970,  p.  463).  Thia  problem  has  not  been  tackled  in  point  processes, 
and  ia  more  difficult  for  two  reasons.  First,  the  periodogram  points 
are  not  quite  uncorrelated,  as  we  saw  in  the  preN-ious  section,  and  also 
the  spectrum  is,  In  theory,  not  limited  in  the  frequency  domain.  (There 
will  always  be  some  band  limiting  due  to  jitter;  see  Lewis,  1970.  ) 

Thus,  it  ia  a  problem  how  to  use  the  distribution  theory  (or  the  spectrum 
given  in  Bartlett  (1963)  and  Cox  and  Lewis  (1966,  Chapter  5)  and  how  to 


pick  relevant  parti  of  the  ipectrum  from  which  to  eitlrnate  the  unknown 
frequency. 

Ai  an  example,  vue  give  in  Figure  H  the  ipectrum  of  the  arrival 

data  of  Section  1.  The  peak  for  the  day  effect  at  p  ■  1329  (u  ■  2it)  la 

P 

evident,  but  it  would  be  difficult  to  read  into  thii  spectrum  anything  but 
a  conclualon  that  it  ia  a  nonhomogeneoui  Poiaaon  proceaa  with  a  aingle 
fixed  cycle  (day -effect).  There  ia  poaalbly  a  harmonic  at  p  ■  3658 
and  aome  inhibition  at  low  frequenciea,  but  all  the  points  (except  for 
p»  1829)  are  within  1  percent  confidence  bands  for  individual  values, 
and  would  be  within  bands  for  the  maximum  of  the  periodogram  values 
(Bartlett,  1963). 

The  ipectrum  of  counts  ia  generally  a  useful  tool,  and  even  more 
particularly  «o  for  non-Polaaon  proc  e  a  ses  ,  but  has  found  very  limited 
acceptance  amongst  applied  workers.  This  is  partly  due  to  confusion 
with  the  spectrum  of  intervals,  with  which  neurophysiologisti^  for  example 
are  much  more  familiar. 

Lewis  (1970)  has  given  a  heuristic  justification  for  the  spectrum 
of  counts  and  discussed  computation  and  smoothing.  Bartlett  (1967)  has 
discussed  finer  points  in  this  type  of  analysis. 

Finally,  Brlllinger  (1972)  has  given  a  general  theory  for  spectral 
analysis  of  point  processes,  and  has  defined  higher  order  spectra 
(cumulant  spectra  of  order  k)  for  point  processes.  These  are  gener¬ 


alizations  of  the  cumulant  spectra  of  order  k  of  continuous  time  aeries. 


S£RUU.  Of  MCX)LE  IN  G«OOP  Of  20  iNTER  ARRivAL  TAtES 


Av*ra 


>t(ure  9  Arrival  of  pallrnlt  at  an  intcnalvr  larr  unll. 

•  •  of  •\iccaa«lvc  group#  (»f  .^0  Inter  -  arrival  fliive#  after  detrending  ith 

•  •  ^  ^  ^  ^ 

eip(ae  4  *  k#ln(a*«  u*f))du.  <*firre  a,  4  V. ,  0  are  rnaai  rfiurn  i  Ik  e  1 1 


hood  eitUnale#.  n  •  14  arrival#  in  IH.'4  day#  frorr\  4  february  19rl  to 
B  February  19f>{t  Overall  average  after  detrending.  l.OOHH.  HorTiogencity 
of  variance  elatletlc  ••  109.  0’  r  ft  ran  71.  <$  •  11.  9. M 
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BriUlns.r'.  work  i.  ba..d  on  a  Ban.r.l  .peclral  repraaentation  for  N(t). 
th.  counting  proc...  of  th.  point  proc...,  and  provida.  an  a.ymptotic 
thaory  tor  tha  .pactr.l  a.timata.  of  ordar  2  (pariodogram)  con.id.rad 
in  thi.  papar.  Brillingar'.  papar  i.  far  too  axtan.iva  to  do  >° 

hara.  Tha  problam  of  whathar  cumnlant  .pactra  of  ordar  higher  than  2 
will  ba  naaful  in  practlca  ramalna  opan  and  ahould  ba  axplortd. 
Brllling.r'r  .pactral  d.compo.itlon  may  provida  an  an.war  to  whathar 
.pactral  analy.i.  can  ba  u.aful  a.  a  rapra.antaticn  in  real  ay.tatna;  l.a.. 

neuron*. 


5.  4  Mixed  models 

Tha  arrival  data  con.idarad  in  Section  1  ha.  bean  .hown  to  have 
a  long  term  trend  which  can  ba  rapra.antad  by  the  axponantiai  quadratic 
, unction  ,S.  S,  and  a  atrong  day  cycia.  A  comb.nad  modal  lor  th.a  daU 
could  ba  a  nonhomoganaou.  Poi.aon  proca..  with  rata  X  (0  of  tha  form 


.  axpfu  *  Pt  t  yt  ♦  k  .inlu^t  +  91). 


(5.  29) 


K  (t)  -  exp 


This  is  difficult  to  handle  because  aH)  =  ^  ^ 

able.  However,  in  tha  conta.t  of  tha  arrival  data  the  avolu.lonary  tr.nd 
change,  little  within  a  cycle  and  an  appro.imat.on  in  wh.ch  tha  linear 
and  quadratic  term,  are  a..umad  con.tant  over  tha  parted  o,  each  cycle 
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yields  a  tractable  model.  The  estimates  of  k  and  6  am  then  simply  (5.  ?0) 
and  (5,  ZZ),  y  and  p  are  estimated  from  (5.  8)  and  (5.  9)  and  \  is 
given  by  (5.  7)  with  ^Q(k)  multiplying  the  term  in  the  denominator. 

For  the  arrival  data,  we  obtain  the  following  values; 

X  =  0.  4792; 

k  »  0.  654; 

9  =  3.  519: 

P  r  0.  0009788; 

V  *  -0.  4481  X  lo'^. 

Clearly,  mixed  models  such  as  (5.29)  will  be  needed  in  many 
cs  ses  for  the  analysis  of  real  data. 

5.  5  Residual  analysis  of  point  processes 

Many  observed  point  processes  have  rate  functions  which  are 
clearly  not  easily  representable  by  simple,  tractable  models  such  as 
those  discussed  in  the  previous  section.  For  example,  it  is  common 
in  observing  arrivals  of  jobs  in  a  computing  center  to  find  (roughly) 
the  rate  increasing  sharply  throughout  the  morning,  dipping  around 
lunch  time,  picking  up  slightly  in  the  afternoon,  and  then  dipping  off 
sharply  at  night.  The  situation  is  fairly  simple  in  this  example,  since 
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many  days  obaervatlona  arc  available,  showing  similar  variations, 
and  careful  pooling  gives  a  smoothed  tabulated  estimate  of  the  rate. 

This  rate  is  needed  to  model  scheduling  algorithms  for  the  computer. 

In  general,  the  problem  of  smoothing  the  point  process  to  obtain 
an  estimate  of  X  (t)  is,  in  my  opinion,  open,  despite  the  work  on  doubly 
stochastic  Poisson  processes  cited  in  Section  4,  which  ohould  be  applic¬ 
able.  It  is  generally  connected  with  the  statistical  problem  of  curve 
smoothing,  and  will  not  be  considered  further. 

Both  in  the  case  where  X  (t)  is  obtained  by  smoothing  or  from  a 
specific  model,  say  exp  for  +  pt),  it  will  often  be  necessary  to  test  the 
adequacy  of  both  the  trend  model  and/or  the  assumption  of  an  underlying 
nonhomogeneous  Poisson  process. 

For  example,  an  intensi'^e  care  unit  is  a  service  facility,  and  to 
provide  adequate  service,  one  must  know  something  about  the  arrival 
process;  it  might  be  very  regular  if  all  arrivals  are  from  surgical 
operations,  or  exhibit  clustering  if  all  arrivals  are  from  the  emergency 
room.  The  form  of  the  underlying  process  will  also  affect  the  test  for 
cyclic  trend  given  in  (5.  23),  the  attempt  to  estimate  the  spectrum  and 
filter  out  the  spike  usually  being  a  selfdefeating  process  (Bartle*^t,  1967; 
Lewis,  197  0). 

Both  formal  and  informal  methods,  usually  graphical,  are  re¬ 
quired  for  this  question,  analagous  to  the  similar  problem  in  regression 
analysis  which  is  usually  appi  cached  by  examining  the  residuals  a'ter 
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estimating  parameters  for  the  mean  value  in  the  model. 

While  keeping  in  mind  that  there  are  many  ways  to  approach 
this  in  different  situations,  the  following  proposal  seems  to  offer  a 
systematic  approach  to  examining  such  questions  (Lewis,  197  0).  It  is 
well  known  that  on  the  transformed  time  scale 

t 

T  *  /  X.  (u)du,  /5.  30) 

0 


a  nonhomogeneous  Poisson  process  with  rate  function  X.  (t)  becomes  a 
homogeneous  Poisson  process  of  rate  1.  Thus,  we  proposed  to  examine 
the  residual  process  where 


^i  ^ 

\{u)du. 


(5.31) 


and  \  (t)  is  some  estimate  of  X.  (t). 

There  are  a  number  of  ways  of  looking  at  this  transform. 


1)  If  X.  (t)  is  very  smooth  over  periods  compared  to  the  mean  time 

between  intervals,  we  can  write 


’■i  -•'i-i  ~ 


(5.  32) 


Then  a  logarithmic  transformation  reduces  this  to  standard  regression 
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models,  and  the  exponential  trend  functions  discussed  above  are  appealing 
since  we  have  (approximately)  standard  linear  regression  methods. 

These  are  considered  in  Cox  and  Lewis  (1966,  Chapter  3). 

Of  course,  in  data  such  as  the  arrival  data  of  Section  1,  this 
approximation  is  not  possible,  and  the  integral  transformation  must 
be  performed,  perhaps  numerically. 

The  real  problem  comes,  of  course,  when  X  (t)  cannot  be 
assumed  to  be  smooth  over  suitably  short  intervals  and  no  simple 
parametric  form  for  \  (t)  is  available. 

2)  If  the  analysis  is  based  on  the  conditional  distribution  of  the 

t^'s,  given  N{tQ)  =  n,  then  the  transformation  (5,  31)  is  seen  from  (5.  2) 
to  be  just  proportional  to  the  probability  integral  transform  for  the  density 
X  (t)/A(t)  with  estimated  parameters. 

3)  There  is  also  a  possibility  of  lookir.g  at  (5.  31)  as  a  filter  in 
the  spectral  domain. 

Formal  methods  for  analyzing  the  are  difficult  and  will  be 
discussed  elsewhere.  This  is  simplest  for  X  (t;^)  when  sufficient 
statistics  for  0  are  available.  The  probability  structure  of  the 
process,  given  the  sufficient  statistics,  is  independent  of  £.  Of 
course,  in  the  simplest  case  of  X  (t)  »  X  ,  with  X  *  well 


(k4 


known  that  the  are  the  gap  atatiatica  for  independent  uniforirj 

random  variablea  and  are  aaymptotically  exponentially  diatributed 
and  independent  (Pyke,  1972).  For  amall  aamplea,  they  will  gener¬ 
ally  be  correlated  random  variablea. 

Informal,  graphical  methods  are  useful  and  are  discussed  in 
the  context  of  the  arrival  data  given  in  Section  1.  The  first  section 
of  this  data  was  transformed,  using  (5.  31)  and  an  exponential  linear 
trend,  and  discussed  in  Lewis  (197  0).  No  test  for  Poisson  processes 
performed  in  the  SASE  IV  program  gave  significant  indication  of 
departure.  The  apectrum  of  the  transformed  data  is  shown  in  Lewis 
(1970). 

Clearly  the  (exponential)  linear  trend  plus  day  cycle  is  not  adequate 
for  the  complete  set  of  data,  as  we  saw  in  Section  I.  We  examine  its  adequacy 
further  here.  Figure  9  shows  a  plot  of  tne  average  of  successive  groups  of 
20  X^'s  after  detrending  with  the  (exponential)  linear  trend  plus  day  cycle.  The 
model  for  X(t)  is  clearly  inadequate.  Figure  lO  shows  the  spectrum  of 
counts  after  the  linear  transformation.  There  is  no  spike  corresponding 
to  the  one  day  period  and  no  indication  of  departure  from  a  Poisson  process. 
Similarly,  the  tests  discussed  in  Section  3  for  Poisson  processes  do  not  give 
any  great  indication  of  departures.  However,  the  estimated  asymptotic  slope 
of  the  variance-time  curve  is  approximately  8,  and  since  this  quantity  la  re¬ 
lated  to  the  rate  of  change  of  the  low  frequency  end  of  the  sF>ectrum,  there  is 
again  un  indication  of  inadequacy  of  the  form  of  X(t)  or  th*  ypothesis  of  a 


nonhomogencous  Poisson  process. 
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A  quadratic  term  was  added  to  the  rate  X  (t),  giving  the  trans¬ 


formation 


t 

-  i  ^  2  ^  t  , 

'.  =  f  exp  {or  +  Pu  +  -yu  +  k  sin(2TTU  +  0)  }du, 

^  0 


(5.33) 


where  the  -v ■  lues  of  the  estimated  parameters  are  given  in  Section  5.4. 
The  plot  ','^jrresponding  to  Figure  9  is  shown  for  the  (exponential) 
quadratic  detrending  in  Figure  11.  This  and  other  results  give  a  much 

clearer  picture  of  a  possible  cycle  or  quasi-cycle  in  the  data  with  period 

\ 

of  about  a  year  and  a  half.  The  examplte  is  discussed  in  more  detail  in 
Lewis  (1971),  where  local  smoothing  is  used  to  estimate  X  (t)  for  this 
quasi-cycle  and  test  the  nonhomogeneous  Poisson  hypothesis. 


0.305 


p(<Jp=27rp/to)  O'’  PERIOD  T  IN  DAYS 


Figure  10.  Arrivkl  of  patients  at  an  intensive  care  unit 
pectrum  of^coi^ts  for  complete  record  after  detrending  with 

T  =J^exp  {^+^\>ksin(2.u^e)}du.  where  a'.  /?.  k.  0  are  maximum 

a'p^brult; ‘m/'^Lnds'  ^  '963  to 

for  .ndividual  value!  ""  -PProxtmatelj  0.  95  and  0.  99  confidence  region 


Figure  11.  Arrival  of  patients  at  an  intensive  care  imif 

-/fer  detrending  with 
•Jn 


exp  {a%  ^u.  ^u^  ksin(2.u.  9)  }  du.  where  o".  ^  maxi- 


mum^ikelihood  estimates. 


mum  iiKeiihood  estimates,  n  =  1458  arrivals  in  t  .  mjo  ^  r 

Horn;?  averagTaVe"  ^ 

Homogeneity  of  variance  statistic:  ^  _ _ *•  *.  .  . 


mean  71.  ir  -  11  o?l 
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6.  CONCLUSIONS 

Much  further  work  ii  needed  in  the  ■tatiitlcal  analyaia  of  aerlea 
of  eventa.  Particular  areaa  that  come  to  mind  are  the  analoguea  of 
the  procedure?  in  Section  5  for  modulated  renewal  proceaaea.  Likeli¬ 
hood  functiona  can  be  written  down,  but  results  are  fairly  difficult  to 
obtain  from  the  expressions  (see  Cox  (197  2)  for  details).  Both  for  these 
processes  and  the  nonhomogeneous  Poisson  process  there  is  a  great 
deal  of  work  to  be  done  to  justified  estimation  and  testing  procedures  based 
on  random  oample  sizes.  Some  of  these  justifications  have  been  given  by 
Brown  ( 197  2). 

Finally,  it  is  hoped  that  new  methods  will  be  developed  which  will 
allow  some  of  the  nonrenewal  point  processes  to  be  analyzed  in  a  more 
systematic  way. 
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